ABSTRACT

Image-aided navigation techniques can determine the navigation
solution (position, velocity, and attitude) by observing a sequence
of images from an optical sensor over time. This operation is based
on tracking the location of stationary objects in multiple images,
which requires solving the correspondence problem. This is an active
area of research and many algorithms exist which attempt to solve
this problem by identifying a unique feature in one image and then
searching subsequent images for a feature match. The correspondence
problem is plagued by feature ambiguity, temporal feature changes,
and occlusions, which are difficult for a computer to address.
Constraining the correspondence search to a subset of the image
plane has the dual advantage of increasing robustness by limiting
false matches and improving search speed. A number of ad-hoc methods
to constrain the correspondence search have been proposed in the
literature.

In this paper, the correspondence problem itself is carefully
analyzed from fundamental optical principles. This development
results in a general temporal sampling constraint and also reveals
the essential connection between the deleterious effects of temporal
aliasing and the ambiguities which plague the correspondence search
problem. This temporal image sampling constraint is expressed as a
function of the navigation trajectory for elementary camera motions.
The predicted sampling rates are on the order of those needed for
adaptive optics control systems and require very large bandwidths.
The temporal image sampling constraint is then re-evaluated by
incorporating inertial measurements. The incorporation of inertial
measurements is shown to reduce the required temporal sampling rate
to practical levels, which evidences the fundamental synergy between
image and inertial sensors for navigation and serves as the basis
for a real-time, adaptive, anti-aliasing strategy.

INTRODUCTION 

It  is  well-known  that  optical  measurements  provide  ex¬ 
cellent  navigation  information,  when  interpreted  properly. 
Optical  navigation  is  not  new.  Pilotage  is  the  oldest  and 
most  natively  familiar  form  of  navigation  to  humans  and 
other  animals.  Mechanical  instruments  such  as  astrolabes, 
sextants,  and  driftmeters  [17]  have  been  used  to  make  pre¬ 
cision  observations  of  the  sky  and  ground  to  improve  navi¬ 
gation  performance  for  centuries. 

The difficulty in using optical measurements for autonomous
navigation, that is, without human intervention, has always been in
the interpretation of the image, a difficulty shared with Automatic
Target Recognition (ATR). Indeed, when celestial observations are
used, the ATR problem in this structured environment is tractable
and automatic star trackers are widely used in astro-inertial
navigation systems for long-range aircraft, space navigation, and
ICBM guidance. When ground images are to be used, the difficulties
associated with image interpretation are paramount. At the same
time, the problems associated with the use of optical measurements
for navigation are somewhat easier than ATR. Moreover, recent
developments in feature tracking algorithms, miniaturization, and
reduction in cost of inertial sensors and optical imagers, aided by
the continuing improvement in microprocessor technology, motivate
us to consider using inertial measurements to aid the task of
feature tracking in image sequences and realize a tightly-coupled
image-aided INS.

Image-aided navigation methods are typically classified as either
feature-based or optic flow-based, depending on how the image
correspondence problem is addressed. Feature-based methods determine
correspondence for “landmarks” in the scene over multiple frames,
while optic flow-based methods typically determine correspondence
for a whole portion of the image between frames using correlation
techniques. A good reference on image correspondence is [13]. Optic
flow methods have been proposed in the literature generally for
elementary motion detection in a somewhat structured environment,
focusing on determining relative velocity or angular rates for
obstacle avoidance [7].

Feature tracking-based navigation methods have been proposed both
for fixed-mount imaging sensors and for gimbal-mounted detectors
which “stare” at the target of interest, similar to the gimballed
infrared detector on some heat-seeking missiles. Many feature
tracking-based navigation methods exploit knowledge (either a
priori, through binocular stereopsis, or by exploiting terrain
homography) of the target location and solve the inverse trajectory
projection problem [1, 14]. If no a priori knowledge of the scene is
provided, egomotion estimation is completely correlated with
estimating the scene. This is referred to as the structure from
motion (SFM) problem. A theoretical development of the geometry of
fixed-target tracking, with no a priori knowledge, is provided in
[16]. An online (Extended Kalman Filter-based) method for
calculating a trajectory by tracking features at an unknown location
on the Earth’s surface, provided the topography is known, is given
in [5]. Finally, navigation-grade inertial sensors and terrain
images collected on a T-38 “Talon” are processed and the potential
benefits of optical-aided inertial sensors are experimentally shown
in [18].

Many methods for solving the correspondence problem have been
proposed in the computer vision literature. A popular algorithm is
the Lucas-Kanade feature tracker [12], which relies on the premise
of the invariance of the intensity field between images. It uses a
template correlation algorithm to minimize the sum of squared
differences (SSD) between image intensities. The algorithm typically
assumes a linear (x-y plane) motion model, but can be extended to
optimize over affine or bilinear transformations. Other feature
correspondence algorithms have been proposed which are invariant to
rotations, scaling, or both (e.g., [10]). More robust feature
tracking algorithms are typically computationally expensive and a
designer must trade tracking robustness and accuracy for real-time
performance.

Current  Correspondence  Constraint  Approaches 

Exploiting  inertial  measurements  to  constrain  the  corre¬ 
spondence  search  has  been  proposed  in  the  literature.  In 
this  section,  two  methods  which  exploit  inertial  measure¬ 
ments  are  discussed. 

Bhanu and Roberts [3] utilize inertial measurements to compensate
for rotation between images and to predict the focus of expansion
in the second image. Once the second image is derotated and the
focus of expansion is established, the correspondence between
interest points is calculated using goodness-of-fit metrics. One
relevant metric is the correspondence search constraint placed on
each point. This constraint ensures each interest point lies in a
cone-shaped region, with apex at the focus of expansion, bisected by
the line joining the focus of expansion and the interest point in
the camera frame at the first image time. While this constraint is
not statistically rigorous, it does show the value of using inertial
measurements to aid the correspondence problem.

Strelow also incorporates inertial measurements to constrain the
correspondence search between image frames [20]. This constraint on
the image search space is a similar concept to the focus of
expansion method proposed by Bhanu; however, Strelow generalizes the
approach by exploiting epipolar geometry. The projection of an
arbitrary point in an image is described by an epipolar line in a
second image. All epipolar lines in an image converge at the
projection of the focus of the complementary image. Combining
knowledge of the translation and rotation between images and the
pixel location of a candidate target in the first image, a
correspondence search can then be constrained to an area “near” the
epipolar line. This approach is illustrated in Fig. 1.
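The epipolar search constraint described above can be illustrated with a minimal sketch. It assumes normalized image coordinates, a relative motion of the form X2 = R X1 + t supplied by the inertial sensors, and the standard essential-matrix relation E = [t]x R. The function names, the sample motion, and the fixed gate width are hypothetical choices for illustration and are not taken from [20].

```python
import math


def skew(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v equals the cross product t x v."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]


def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def epipolar_line(rotation, translation, x1):
    """Line (a, b, c) in image 2 for the point x1 = (x, y, 1) in image 1.

    Assumes X2 = R X1 + t, so that x2 . (E x1) = 0 with E = [t]_x R.
    """
    essential = mat_mul(skew(translation), rotation)
    return mat_vec(essential, x1)


def distance_to_line(line, x2):
    """Perpendicular distance from x2 = (x, y, 1) to the line a*x + b*y + c = 0."""
    a, b, c = line
    return abs(a * x2[0] + b * x2[1] + c) / math.hypot(a, b)


def within_gate(line, x2, gate):
    """True when a candidate match lies closer to the epipolar line than the gate."""
    return distance_to_line(line, x2) < gate


# Hypothetical motion between frames: small rotation about y plus a translation.
angle = 0.1
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t = [1.0, 0.2, 0.0]

X1 = [0.5, 0.2, 10.0]                               # landmark in camera frame 1
X2 = [a + b for a, b in zip(mat_vec(R, X1), t)]     # same landmark in camera frame 2
x1 = [X1[0] / X1[2], X1[1] / X1[2], 1.0]
x2 = [X2[0] / X2[2], X2[1] / X2[2], 1.0]

line = epipolar_line(R, t, x1)
```

With error-free motion the true match lies on the line, so `distance_to_line(line, x2)` is zero to within rounding. The fixed `gate` mirrors the ad-hoc notion of an area "near" the line; it carries no statistical meaning.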

Strelow’s  method  of  using  inertial  measurements  to  con¬ 
strain  the  correspondence  search  along  an  epipolar  line  is 
ad-hoc,  since  the  search  space  is  not  defined  statistically. 
This  method  could  be  improved  by  utilizing  a  stochasti¬ 
cally  rigorous  development. 

In previous publications, we have presented an approach which
leverages the inertial measurements and any available terrain
information to predict the locations and statistical uncertainty of
features in a new image [24,25]. Our goal in this article is to
expand the stochastic constraint theory to an elemental level which
is dependent on the inherent optical properties of the sensor.
Analyzing the correspondence problem from this perspective reveals
the parallel nature between feature correspondence searching and
temporal sampling theory in signal processing, which is
well-understood. As a result, feature correspondence ambiguity is
shown to be analogous to temporal aliasing. Thereby, sampling theory
can be used to predict and mitigate/avoid the presence of aliasing
in feature space.

Figure 1: Correspondence search constraint using epipolar lines.
Given a projection of an arbitrary point in an initial image,
combined with knowledge of the translation and rotation to a second
image, the correspondence search can be constrained to an area near
the epipolar line. Note the epipole can be located outside of the
image plane, as shown in this example.

In  the  next  section,  the  theory  of  image  sampling  is  devel¬ 
oped  from  first  principles,  with  particular  attention  to  the 
anticipated  issues  with  regard  to  temporal  sampling. 


Effects  of  Egomotion  on  Image  Formation 

As  discussed  in  the  previous  section,  the  recorded  image 
is  a  representation  of  the  optical  intensity  patterns  gen¬ 
erated  by  a  scene.  The  projection  function  is  a  function 
of  the  scene  itself,  the  camera  optical  properties,  and  the 
pose  (i.e.,  relative  position  and  orientation)  of  the  camera 
and  scene.  This  strong  coupling  between  camera  pose  and 
the  image  is  the  basis  for  the  rapidly  growing  research  ef¬ 
forts  dedicated  to  exploiting  images  to  determine  changes 
in  camera  pose.  In  this  section,  the  geometric  projection 
function  is  developed  using  a  pinhole  camera  model.  This 
model  will  be  used  as  a  basis  for  quantifying  the  effects  of 
egomotion  and  temporal  sampling. 

Optical  Sensor  Model 

An  optical  sensor  is  a  device  designed  to  measure  the  in¬ 
tensity  of  optical  energy  (light)  entering  the  sensor  through 
an  aperture.  Imaging  sensors  consist  of  an  array  of  light- 
sensitive  detectors  which  create  a  two-dimensional  light  in¬ 
tensity  measurement  (i.e.,  image).  In  this  section,  the  basic 
physical  properties  of  an  optical  sensor  are  presented,  and 
a  model  representing  an  optical  sensor  is  given. 

For the purposes of this discussion, the world is defined as a
collection of all real objects. Some objects are sources of
radiometric illumination or radiance. These light sources illuminate
the world and interact with the other physical objects through
various types of reflection. The amount of light along a certain
direction is defined as the irradiance [13]. The physical irradiance
pattern entering the aperture of the optical sensor is defined as
the scene and is represented by a continuous array of nonnegative
real numbers, $o(x,y,t)$, projected onto the image plane. For the
purposes of this discussion, the irradiance sources are constrained
to an arbitrary, piecewise continuous, Lambertian surface in three
dimensions.


GENERAL  IMAGE  SAMPLING  PROBLEM 

The  mathematical  relationships  governing  spatial-temporal 
sampling  are  developed  from  basic  optical  and  sampling 
theory.  This  development  provides  a  theoretical  basis 
which  is  used  to  develop  temporal  sampling  constraints  in 
subsequent  sections. 

Image  Sampling  Considerations 

A  digital  imaging  device  is,  in  essence,  a  sampler  of  light 
intensity  patterns  in  three  dimensions:  two  spatial  and  one 
temporal.  Analyzing  the  effects  of  the  sampling  process  on 
image  sequences  resulting  from  camera  motion  with  due 
regard  given  to  the  motion’s  dynamics  has  very  important 
implications  on  how  to  properly  interpret  image  sequences 
to  derive  navigation  information. 


A digital optical imaging sensor consists of an aperture, lens,
detector array, and sampling array. A simple imaging system model is
shown in Figure 2. The lens focuses the scene on the detector array.
The light pattern focused on the detector array is defined as the
image and represented by $i(x,y,t)$. In statistical terms, the image
is the mean photon arrival rate, and is defined by a Poisson
distribution [4]. The detector array converts the light energy into
a voltage or a charge which is converted to a digital value by the
sampling array. The sampling array is assumed to be a square grid,
although other patterns can be designed (e.g., honeycomb) [8].

Figure 2: Digital imaging system. The imaging system transforms the
scene into a digital image. The major components of the camera are
the optics, light detector, amplifier, and analog to digital
converter.

The lens is an analog low-pass filter in the spatial domain, with a
cutoff frequency ($f_c$) determined by the aperture ($D$),
wavelength of light source ($\lambda$), and focal length of the
camera ($f_0$) [4]:

$$ f_c = \frac{D}{\lambda f_0} \quad \left(\frac{1}{\mathrm{m}}\right) \tag{1} $$

Thus, a scene consisting of a point source of light (delta function
intensity) would appear slightly blurred (spread) on the image
plane. Assuming spatial invariance, this blurring due to the lens is
represented by the point spread function, $h(\xi, \rho)$, where
$\xi$ and $\rho$ are the spatial differences in the x and y
directions, respectively. The image in the spatial domain can now be
expressed mathematically as the convolution of the scene and point
spread function [6]:

$$ i(x,y,t) = \int_{\xi \in X} \int_{\rho \in Y} o(\xi,\rho,t)\, h(x-\xi,\, y-\rho)\, d\rho\, d\xi \tag{2} $$

The image is physically continuous in space and time. This
continuous function of three variables is then sampled and converted
to an array of (digital) numbers. Concerning the sample process, the
light energy in the image is integrated in each pixel over a
temporal period defined as the dwell time ($\Delta t$). The sampled
image, $i_s(m,n,t_i)$, is obtained for integer pixel location
$(m,n)$ and sample time $t_i$ as

$$ i_s(m,n,t_i) = \int_{t_i-\Delta t/2}^{t_i+\Delta t/2} \int_{m-\Delta x/2}^{m+\Delta x/2} \int_{n-\Delta y/2}^{n+\Delta y/2} i(x,y,t)\, dx\, dy\, dt \tag{3} $$


Analyzing the image sampling process from a frequency domain
perspective provides insights into the rigorous use of images for
navigation purposes. As previously mentioned, the camera optics act
as an analog low-pass filter, characterized by the time-invariant
point spread function, $h(\xi, \rho)$. The frequency domain
representation of the point spread function, $H(f_x,f_y)$, is called
the optical transfer function. Applying the Fourier transform to the
image equation (2) shows the multiplicative relationship for the
spatial frequency domain representation of the image:

$$ I(f_x,f_y,t) = O(f_x,f_y,t)\, H(f_x,f_y) \tag{4} $$

In most conditions, the projected scene can be treated as a wideband
function relative to the optical transfer function (i.e.,
$f_{c_{scene}} \gg f_{c_{OTF}}$). This results in the following
spatial frequency limitation of the projected image:

$$ I(f_x,f_y,t) = 0, \quad \forall\; |f_x|, |f_y| > f_c \tag{5} $$

This relationship is expressed graphically in Figure 3.

Figure 3: Effects of camera optics on image spatial frequency. The
camera optics act as a low-pass filter with a cutoff frequency of
$f_c$. The scene, which is wideband, appears as a band-limited image
on the detector array.


The sampling operation can be represented by a zero-order hold (or
sample-and-hold) process in the spatial domain and as a natural
sampling process in the time domain (see [19]). The resulting
frequency spectrum for the sampled image consists of sinc-weighted
copies of the image frequency response, located at integer multiples
of the spatial sampling frequency. An illustration is shown in
Figure 4. Hence, to prevent spatial aliasing, that is, to avoid the
“diffraction limit”, the spatial sampling rate must satisfy the
spatial Nyquist condition in both dimensions, which is determined by
the camera optics as:

$$ f_x, f_y > 2 f_c = \frac{2D}{\lambda f_0} \tag{6} $$

where $f_x$ and $f_y$ are the spatial sampling rates in the x and y
directions, respectively. These are directly related to the physical
pixel size as:

$$ f_x = \frac{1}{\Delta x} \tag{7} $$

$$ f_y = \frac{1}{\Delta y} \tag{8} $$

where $\Delta x$ and $\Delta y$ are the pixel sizes in the x and y
directions.

Figure 4: Spatial Sampling Illustration. The spatial sampling
process creates sinc-weighted spectral copies in the spatial
frequency domain when square pixels are used. The sampling
frequency, $f_s$, must be greater than twice the cutoff frequency,
$f_c$, to eliminate spatial aliasing.

Camera motion changes the projection of a stationary scene which,
for a simple point illumination source, results in an apparent image
“shift”. This image shift results in a modulation of the frequency
content in the temporal frequency domain. More specifically, the
velocity of a point source in the image plane, $\dot{s}^{proj}$,
results in the following Nyquist temporal sampling constraint:

$$ f_t > 2 \max\{\dot{s}^{proj}_x,\, \dot{s}^{proj}_y\}\, f_c \tag{9} $$


Assuming square pixels which are sized according to the spatial
Nyquist sampling (i.e., $f_x = f_y = 2 f_c$) results in the
following pixel size:

$$ \Delta_{pixel} = \frac{1}{2 f_c} \tag{10} $$

Substituting Eqn. (10) into (9) results in the normalized temporal
sampling constraint:

$$ f_t > \frac{\max\{\dot{s}^{proj}_x,\, \dot{s}^{proj}_y\}}{\Delta_{pixel}} \tag{11} $$


As a result, to minimize temporal aliasing, the Nyquist rate can be
achieved by ensuring no feature moves more than one half of the
minimum distance between intensity peaks in the image plane. Given
an optical cutoff frequency of $f_c$, the temporal sampling
interval, $T_s$, should be chosen such that the maximum image shift
due to camera motion is less than $\Delta_{pixel}$. This implies a
fundamental interrelationship between the minimum spatial and
temporal sampling intervals, which is somewhat similar to the
spatial-temporal discretization constraint found when solving the
heat PDE, also known as the von Neumann condition.
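The chain from optical cutoff to pixel size to minimum frame rate can be checked numerically. The sketch below assumes the cutoff takes the form f_c = D / (lambda f_0), which is consistent with the 4.4 micrometer pixel pitch used later in the case study, and takes the wavelength as 550 nm. The function names and the 1 mm/s image-plane speed are illustrative assumptions only.

```python
def optical_cutoff(aperture_m, wavelength_m, focal_length_m):
    """Spatial cutoff frequency in cycles per metre, assuming f_c = D / (lambda * f0)."""
    return aperture_m / (wavelength_m * focal_length_m)


def nyquist_pixel_size(f_c):
    """Largest pixel pitch (m) for which the sampling rate 1/pitch reaches 2 * f_c."""
    return 1.0 / (2.0 * f_c)


def min_frame_rate(image_speed_m_per_s, pixel_size_m):
    """Temporal sampling bound (Hz): image-plane speed divided by the pixel pitch."""
    return image_speed_m_per_s / pixel_size_m


# Values assumed from the case study camera: 550 nm light, 6 mm lens at f/16.
wavelength = 550e-9
focal_length = 6e-3
aperture = focal_length / 16

f_c = optical_cutoff(aperture, wavelength, focal_length)
pixel = nyquist_pixel_size(f_c)

# Hypothetical feature moving at 1 mm/s across the image plane.
rate = min_frame_rate(1e-3, pixel)
```

Under these assumptions `pixel` evaluates to 4.4e-6 m, and a feature moving at 1 mm/s on the image plane calls for a frame rate above roughly 227 Hz.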

In  the  next  section,  a  mathematical  model  describing  the 
relationship  between  point  locations  in  the  world  and  image 
will  be  derived.  The  resulting  projection  equations  will  be 
used  to  calculate  appropriate  temporal  sampling  intervals, 
based  on  scene  geometry  and  camera  motion. 


Egomotion  Effects  on  Temporal  Sampling 

In the previous section, the effects of egomotion on the formation
of the image were presented. In this section, the egomotion effects
on temporal sampling are illustrated. Reducing the spatial
dimensionality of the problem from two to one is performed in order
to illustrate the effects of egomotion on temporal sampling in a
manner that is easier to visualize.

Figure 5: Thin lens camera model. The thin lens model directs
parallel light rays toward the focus, resulting in an image. Figure
is not to scale.


Projection  Theory 


The camera optical properties define the relationship between the
scene and the projected image. Recalling the simple camera model
(Figure 2), the lens focuses the incoming irradiance pattern (i.e.,
scene) onto the image plane. For a theoretical thin lens, the
projection is a function of the focal length of the lens and the
distance from the lens, as shown in Figure 5. This relationship is
expressed by the fundamental equation of the thin lens [13]:

$$ \frac{1}{Z} + \frac{1}{z} = \frac{1}{f_0} \tag{12} $$

where $Z$ is the distance from the object to the lens, $z$ is the
distance from the lens to the image plane, and $f_0$ is the focal
length.

As the aperture of the thin lens decreases to zero, the system can
be modeled as a pinhole camera (see Figure 6). In this model, all
incoming light must pass through the optical center and is projected
on an image plane located at a distance $f_0$ from the lens. The
resulting image is an inverted projection of the scene.


This model can be further simplified by placing a virtual image
plane in front of the optical center, as shown in Figure 7. Given a
point source at location $s^c$, the resulting location of the point
source on the image plane, relative to the optical center of the
camera, is given by

$$ s^{proj} = \frac{f_0}{s^c_z}\, s^c = f_0\, \underline{s}^c \tag{13} $$

where $s^c_z$ is the distance of the point source from the optical
center of the camera in the $z_c$ direction. The underline indicates
a vector expressed in homogeneous notation, which is given by:

$$ \underline{s}^c = \frac{1}{s^c_z}\, s^c \tag{14} $$

Figure 6: Pinhole camera model. The pinhole camera is a theoretical
camera model where a thin lens aperture approaches zero. The
projected image is inverted on the image plane.

Figure 7: Camera projection model. The pinhole camera model is
modified by placing a virtual image plane one focal length in front
of the optical center. As a result, this model eliminates the image
inversion present in the standard pinhole camera model.


In order to interpret the calculated projection in a digital image,
the physical image plane coordinates must be converted to a
coordinate system based on pixel location. The following development
defines the pixel coordinate system and derives the transformation
from the physical image plane to pixel location. The image plane
consists of an (M x N) grid of rectangular pixels with height H and
width W, shown in Figure 8. The origin of the projection frame is
located at the physical center of the array. The origin of the pixel
coordinate system is located beyond the upper left corner of the
array, such that the center of the upper left pixel corresponds to
the (1,1) pixel coordinate. This definition of pixel coordinates
corresponds to the elemental matrix locations when the image is
stored in a computer. This can be expressed as a two-element vector

$$ s^{pix} = \begin{bmatrix} u \\ v \end{bmatrix} \tag{15} $$

where $u$ and $v$ are the row and column corresponding to the pixel
of interest.

Figure 8: Camera image array. The camera imager consists of an
(M x N) array of pixels. The physical height and width of the array
are represented by H and W, respectively.


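The transformation from the projection coordinates to pixel
coordinates is given by:

$$ s^{pix} = \begin{bmatrix} \frac{1}{\Delta x} & 0 & 0 \\ 0 & \frac{1}{\Delta y} & 0 \end{bmatrix} s^{proj} + \begin{bmatrix} \frac{M+1}{2} \\ \frac{N+1}{2} \end{bmatrix} \tag{16} $$

where $\Delta x$ and $\Delta y$ are the sizes of the pixels in the x
and y directions, respectively, which are defined as:

$$ \Delta x = \frac{H}{M} \tag{17} $$

$$ \Delta y = \frac{W}{N} \tag{18} $$

Combining Eqs. (13) and (16) and expressing the projected pixel
location vector using homogeneous coordinates yields the following
affine transformation from camera frame to pixel location:

$$ \underline{s}^{pix} = \begin{bmatrix} \frac{f_0}{\Delta x} & 0 & \frac{M+1}{2} \\ 0 & \frac{f_0}{\Delta y} & \frac{N+1}{2} \\ 0 & 0 & 1 \end{bmatrix} \underline{s}^c \tag{19} $$

$$ \underline{s}^{pix} = T^{pix}_c\, \underline{s}^c \tag{20} $$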


A transformation from a landmark location in navigation frame
coordinates to pixel coordinates can now be derived based on the
navigation state. The geometry is shown in Figure 9. The line of
sight vector, s, is the vector difference between the target
location, t, and the camera position, p, which are both available in
navigation frame coordinates (n):

$$ s^n = t^n - p^n \tag{21} $$

The resultant vector can be transformed to the camera reference
frame using the navigation-to-camera frame direction cosine matrix:

$$ s^c = C^c_n s^n \tag{22} $$

Finally, the pixel location is calculated using Eqn. (20).

Figure 9: Target to image transformation geometry. The relationship
between the camera position, (p), and target location, (t), can be
expressed in pixel coordinates using transformations based on the
navigation state and camera parameters.

Apparent Pixel Motion Calculations

The previous development is extended to illustrate the apparent
pixel motion of a point feature due to relative motion. The
development begins by recalling the camera-to-pixel transformation
shown in Eqs. (13-22):

$$ \underline{s}^{pix} = T^{pix}_c\, \underline{s}^c = T^{pix}_c\, s^c / s^c_z \tag{23} $$

where the camera frame line of sight vector, $s^c$, is given by

$$ s^c = C^c_n \left[ t^n - p^n \right] \tag{24} $$

The apparent pixel motion is derived by taking the derivative of
$\underline{s}^{pix}$ with respect to time:

$$ \dot{\underline{s}}^{pix} = T^{pix}_c\, \dot{\underline{s}}^c \tag{25} $$

where

$$ \dot{\underline{s}}^c = \frac{s^c_z\, \dot{s}^c - s^c\, \dot{s}^c_z}{(s^c_z)^2} \tag{26} $$

The time-derivative of the camera frame line-of-sight vector is
given by

$$ \dot{s}^c = C^c_n \Omega^n_{cn} \left[ t^n - p^n \right] + C^c_n \left[ \dot{t}^n - \dot{p}^n \right] \tag{27} $$

where $\Omega^n_{cn}$ is the skew-symmetric form of the angular rate
of the camera to the navigation frame, expressed in the navigation
frame. The skew-symmetric form is defined in [23]. Expressing the
rotations in the camera frame yields the following equivalent form:

$$ \dot{s}^c = -\Omega^c_{nc}\, s^c + C^c_n \left[ \dot{t}^n - \dot{p}^n \right] \tag{28} $$

Analysis of Eqn. (28) shows that the change in line-of-sight vector
is a function of both the camera rotation and relative translational
motion between the camera and landmark of interest.

In many cases, the landmark motion relative to the navigation frame
is insignificant and can be neglected. Applying this assumption and
coordinatizing the camera translational motion in the camera frame
yields

$$ \dot{s}^c = -\Omega^c_{nc}\, s^c - v^c \tag{29} $$


where $v^c$ is the velocity of the camera, relative to the
navigation frame, coordinatized in the camera frame. Combining
Eqs. (25), (26) and (29) results in the well-known optical flow
equation [21]:


Substituting  Eqn.  (29)  into  Equations  (26)  and  (25)  results 
in  the  apparent  pixel  motion  in  the  x  and  y  directions: 


u  =  — 


v  = 


which  is  expressed  using  the  scalar  components  of  the  rota¬ 
tion,  velocity,  and  line-of-sight  vectors  referenced  in  Equa¬ 
tion  (29). 
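The apparent pixel motion of a stationary landmark can be sketched numerically by chaining the line-of-sight rate, the quotient rule for the homogeneous vector, and the pixel scale factors. The sketch assumes positive scale factors f0/dx and f0/dy in the camera-to-pixel transformation and replaces the skew-symmetric product with a cross product. The function names and sample values are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def los_rate(s_c, omega_c, v_c):
    """Line-of-sight rate for a stationary landmark: s_dot = -(omega x s) - v."""
    w = cross(omega_c, s_c)
    return tuple(-w[i] - v_c[i] for i in range(3))


def pixel_rate(s_c, omega_c, v_c, f0, dx, dy):
    """Apparent pixel motion (pixels/s) in the two image directions.

    s_c     line-of-sight vector in the camera frame (m)
    omega_c angular rate of the camera, camera frame (rad/s)
    v_c     camera velocity, camera frame (m/s)
    f0      focal length (m); dx, dy pixel sizes (m)
    """
    sx, sy, sz = s_c
    dsx, dsy, dsz = los_rate(s_c, omega_c, v_c)
    # Quotient rule applied to s / s_z gives the rate of the homogeneous vector.
    rate_x = (dsx * sz - sx * dsz) / sz ** 2
    rate_y = (dsy * sz - sy * dsz) / sz ** 2
    # Scale to pixels; positive scale factors are an assumption of this sketch.
    return (f0 / dx * rate_x, f0 / dy * rate_y)


# Landmark on the optical axis, camera rotating at 0.1 rad/s about its x axis.
rotation_only = pixel_rate((0.0, 0.0, 100.0), (0.1, 0.0, 0.0), (0.0, 0.0, 0.0),
                           6e-3, 4.4e-6, 4.4e-6)

# Landmark 10 km away on the optical axis, camera translating at 300 m/s along x.
translation_only = pixel_rate((0.0, 0.0, 10000.0), (0.0, 0.0, 0.0),
                              (300.0, 0.0, 0.0), 6e-3, 4.4e-6, 4.4e-6)
```

Only the magnitudes of the returned rates matter for the sampling constraint, so a different sign convention in the pixel transformation would not change the resulting frame-rate bound.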


The temporal sampling constraint proposed in the previous section
indicates that it is desirable to sample such that the apparent
pixel motion is limited to no more than one pixel of change per
image in both the x and y spatial dimensions, provided the image is
sampled at the spatial Nyquist frequency. Given a sample interval,
$T_s$, the maximum pixel motion component, $K_{max}$, can be
approximated by

$$ K_{max} = \max\{|\dot{u}|\, T_s,\; |\dot{v}|\, T_s\} < 1 \tag{32} $$


In  the  next  section,  the  derived  apparent  pixel  motion  is 
analyzed  for  a  representative  scenario  which  illustrates  the 
difficulty  in  achieving  samples  from  traditional  imaging 
systems  which  do  not  violate  the  temporal  sampling  con¬ 
straints  presented  above. 

ILLUSTRATIVE  CASE  STUDY 

In this section, the apparent pixel motion is calculated for a
selection of representative imaging scenarios. As previously
developed, the generalized sampling characteristics of a given
imaging sensor are a function of a number of parameters. In this
scenario, we will assume that the camera intrinsic parameters (i.e.,
$\Delta x$, $\Delta y$, $f_0$, $D$, and $\lambda$) are fixed in such
a way as to guarantee proper spatial sampling. For this case, we are
interested in the resulting temporal sampling rate ($f_t$) which is
consistent with the temporal sampling constraints derived in the
previous section. The camera intrinsic parameters are chosen to be
representative of currently available machine-vision cameras. These
parameters are shown in Table 1.


Table 1: Camera Intrinsic Parameters. The camera intrinsic
parameters are chosen to be representative of currently available
machine vision cameras and are chosen to eliminate spatial aliasing.

Description             Parameter    Value   (Units)
Wavelength              $\lambda$    550     nm
Focal length            $f_0$        6       mm
Lens Aperture           $D$          6/16    mm
Vertical Image Size     $M$          1024    pixels
Vertical Pixel Size     $\Delta x$   4.4     μm
Horizontal Image Size   $N$          1280    pixels
Horizontal Pixel Size   $\Delta y$   4.4     μm


The  first  case  study  is  a  simple  5—  horizontal  pan,  with 
no  translational  motion.  The  resulting  motion  parameters 
for  this  condition  are  as  follows: 


v 


C 


OJ 


c 

nc 


o 

0 

0 


0 

0 


(33) 

(34) 


Substituting  these  motion  parameters  and  the  intrinsic  cam¬ 
era  parameters  into  Eqs.  (30)  and  (31)  yields 


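$$ K_{\max} = \frac{6\ \text{mm}}{4.4\ \mu\text{m}}\, T_s \max\left\{ |x_n y_n|,\ 1 + y_n^2 \right\} \frac{5\pi}{180} \tag{35} $$

(here $x_n$ and $y_n$ are shorthand for the image-plane
coordinates of the point source normalized by the focal
length).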


As  evident  in  Eqn.  (35),  the  pixel  motion  is  primarily  a 
function  of  the  camera  motion  with  second-order  effects 
related  to  the  position  of  the  point  source  within  the  image. 
The  worst-case  condition  occurs  at  the  extreme  extents  of 
the  image.  Substituting  these  conditions  into  (35)  resolves 
the  maximization  ambiguity 

$$ K_{\max} = \frac{6\ \text{mm}}{4.4\ \mu\text{m}}\, T_s \max\{0.1762,\ 1.2203\}\, \frac{5\pi}{180} \tag{36} $$

$$ K_{\max} = 145.2\, T_s \tag{37} $$

Applying  the  temporal  sampling  constraint  and  solving  for 
Ts  yields: 

$$ T_s < \frac{1}{145.2} \quad (\text{sec}) \tag{38} $$

which  results  in  a  minimum  frame  rate  of  145.2  Hz  and, 
consequently,  a  maximum  exposure  time  of  6.9  ms. 
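The arithmetic behind Eqs. (35) through (38) can be reproduced with a short script. This is a minimal sketch under stated assumptions: the two position-dependent coefficients are taken to be $|x_n y_n|$ and $1 + y_n^2$, with $x_n$ and $y_n$ the image-corner coordinates normalized by the focal length. That is one reading which reproduces the values 0.1762 and 1.2203 quoted in Eq. (36); it is not copied from Eqs. (30) and (31). All variable names are illustrative.

```python
import math

# Table 1 values
f0 = 6e-3           # focal length, metres
dx = dy = 4.4e-6    # vertical / horizontal pixel pitch, metres
M, N = 1024, 1280   # vertical / horizontal image size, pixels

pan_rate = math.radians(5.0)  # 5 deg/s pan expressed in rad/s

# Worst case: point source at the corner of the image (assumed reading).
x_n = (M / 2) * dx / f0   # normalized vertical coordinate
y_n = (N / 2) * dy / f0   # normalized horizontal coordinate

coefficients = (abs(x_n * y_n), 1.0 + y_n ** 2)

# Apparent pixel motion per second; K_max = pixel_rate * T_s.
pixel_rate = (f0 / dx) * max(coefficients) * pan_rate

# Constraint K_max < 1 pixel  =>  T_s < 1 / pixel_rate.
max_interval = 1.0 / pixel_rate

print(f"coefficients: {coefficients[0]:.4f}, {coefficients[1]:.4f}")
print(f"minimum frame rate: {pixel_rate:.1f} Hz")
print(f"maximum exposure time: {max_interval * 1e3:.1f} ms")
```

The dominant coefficient, $1 + y_n^2$, is only about 22 percent larger than its on-axis value of 1, which is why the text describes the position dependence as a second-order effect.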


In  the  next  example,  the  effects  of  translational  motion  are 
investigated.  Here,  the  camera  is  moving  at  300  meters  per 
second  with  a  fixed  orientation.  The  distance  to  the  terrain 
is  10,000  meters,  which  represents  a  high-altitude  cruise 
profile  for  an  aircraft.  The  resulting  motion  parameters  for 
this  condition  are  as  follows: 


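$$ \mathbf{v}^c = \begin{bmatrix} 300 & 0 & 0 \end{bmatrix}^T \quad (\text{m/s}) \tag{39} $$

$$ \boldsymbol{\omega}^c_{nc} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}^T \quad (\text{rad/sec}) \tag{40} $$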

Substituting  these  motion  parameters  and  the  intrinsic  cam¬ 
era  parameters  into  Eqs.  (30)  and  (31)  yields 


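$$ K_{\max} = \frac{6\ \text{mm}}{4.4\ \mu\text{m}}\, T_s \max\left\{ \frac{300\ \text{m/s}}{10000\ \text{m}},\ 0 \right\} \tag{41} $$

$$ K_{\max} = 40.9\, T_s \tag{42} $$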


Applying  the  temporal  sampling  constraint  and  solving  for 
Ts  yields: 


$$ T_s < \frac{1}{40.9} \quad (\text{sec}) \tag{43} $$


which  results  in  a  minimum  frame  rate  of  40.9  Hz  and  max¬ 
imum  exposure  time  of  24.4  ms. 


The final example represents the conditions expected during
a low-level, high-speed dash profile. As in the previous
example, the camera is moving at 300 meters per second
with a relatively fixed orientation. However, in this case,
the distance to the terrain is reduced to 300 meters. The
resulting motion parameters for this condition are as follows:

$$ \mathbf{v}^c = \begin{bmatrix} 300 & 0 & 0 \end{bmatrix}^T \quad (\text{m/s}) \tag{44} $$

$$ \boldsymbol{\omega}^c_{nc} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}^T \quad (\text{rad/sec}) \tag{45} $$

Substituting  these  motion  parameters  and  the  intrinsic  cam¬ 
era  parameters  into  Eqs.  (30)  and  (31)  yields 


$$ K_{\max} = \frac{6\ \text{mm}}{4.4\ \mu\text{m}}\, T_s \max\left\{ \frac{300\ \text{m/s}}{300\ \text{m}},\ 0 \right\} \tag{46} $$

$$ K_{\max} = 1363.6\, T_s \tag{47} $$

Applying  the  temporal  sampling  constraint  and  solving  for 
Ts  yields: 


$$ T_s < \frac{1}{1363.6} \quad (\text{sec}) \tag{48} $$


which results in a minimum frame rate of 1363.6 Hz and
maximum exposure time of 733 μs.
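The two translational examples differ only in the range to the terrain, so they can be checked with one small helper. This is a minimal sketch: the function name `translation_pixel_rate` is hypothetical, and the expression is simply the nonzero term of Eqs. (41) and (46), i.e. $(f_0/\Delta x)$ times speed over range.

```python
F0 = 6e-3        # focal length f_0, metres (Table 1)
PITCH = 4.4e-6   # pixel pitch, metres (Table 1, same in both directions)


def translation_pixel_rate(speed, terrain_range, f0=F0, pitch=PITCH):
    """Apparent pixel motion per second for a non-rotating camera moving at
    `speed` (m/s) with the terrain at distance `terrain_range` (m)."""
    return (f0 / pitch) * (speed / terrain_range)


cruise_rate = translation_pixel_rate(300.0, 10_000.0)  # high-altitude cruise
dash_rate = translation_pixel_rate(300.0, 300.0)       # low-level dash

for label, rate in (("cruise", cruise_rate), ("dash", dash_rate)):
    # K_max = rate * T_s < 1 pixel, so the frame rate must exceed `rate`.
    print(f"{label}: frame rate > {rate:.1f} Hz, exposure < {1e3 / rate:.3f} ms")
```

Because the required frame rate scales inversely with range, reducing the terrain distance from 10,000 m to 300 m raises the requirement by the same factor of about 33, from 40.9 Hz to 1363.6 Hz.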


These  case  studies  illustrate  the  frame  rates  required  to 
sample  at  the  Nyquist  frequency.  In  general,  the  desired 
frame  rates  are  not  readily  attainable  using  common  hard¬ 
ware  and  lighting  conditions.  In  many  current  correspon¬ 
dence  search  schemes,  (e.g.  [11],  [9],  [12]),  the  Nyquist 
sampling  frequency  for  point  sources  is  simply  ignored  and 
the  search  scheme  seeks  so-called  “strong”  features  which 
are  consistent  between  frames  and  geometrically  consistent 
within  a  collection  of  other  features  (e.g.,  RANSAC).  It  is 
our  assertion  that  these  feature  extraction  and  correspon¬ 
dence  techniques  are  effectively  applying  low-pass,  anti¬ 
aliasing  filters  which  eliminate  the  higher  frequency  com¬ 
ponents  which  are  corrupted  by  temporal  aliasing. 


As  mentioned  previously,  there  is  a  strong  coupling  be¬ 
tween  changes  in  camera  pose  and  the  apparent  pixel  mo¬ 
tion.  In  the  next  section,  measurements  from  an  inertial 
sensor  are  used  to  mitigate  the  effects  of  temporal  aliasing. 


INCORPORATION OF INERTIAL SENSOR MEASUREMENTS


As shown in the previous section, non-aliased temporal
sampling can require relatively high frame rates, even for
relatively simple imaging scenarios. High frame rates can
present a number of challenges for a given imager, including
high communication bandwidth requirements and short
exposure times, requiring more sensitive (and expensive)
sensors. We propose to exploit the information provided
by inertial sensors in order to reduce the image sampling
rates required to deliver anti-aliased measurements. The
development of the aided sampling theory is presented as
follows.

Inertial sensors can provide three-dimensional measurements
of both angular rate and specific force (i.e., the sum
of acceleration with respect to inertial space and gravity) [22].
When combined with a kinematic model, this information
can be exploited to produce an estimate of trajectory. For
the purposes of this illustration, the error dynamics can be
sufficiently modeled using the following method.

When target motion was assumed to be effectively stationary,
the apparent pixel motion (Eqns. 30 and 31) was a
function of the camera rotation rate and velocity with respect
to the navigation frame and the relative location of
the landmark. Strapdown inertial sensors measure both the
angular rotation increment, $\Delta\boldsymbol{\theta}^c$, and specific force
increment, $\Delta\mathbf{v}^c$, with respect to the inertial reference frame.
When combined with knowledge of the gravity vector,
kinematic equations can be used to estimate the position,
velocity, and attitude of the sensor. The inertial measurement
errors, initial navigation state uncertainty, and errors
in the gravity model all contribute to the inevitable, unstable
error growth experienced by all unaided strapdown inertial
navigation systems. A thorough development of these
properties can be found in [22].

While all inertial navigation systems experience unstable
error growth over time, the relatively short durations between
images allow us to model the errors between successive
images using a simpler model. The first approximation
assumes that the navigation reference frame is effectively
an inertial reference frame over the short term. The second
approximation assumes a general knowledge of the navigation
state (e.g., the system is reasonably aligned) such that
any errors in the navigation state itself do not dominate the
pixel motion prediction between frames.

Thus, the simplified inertial sensor model represents the
measurement as the sum of the true value plus an error and
is given as

$$ \tilde{\boldsymbol{\omega}}^c_{nc} = \boldsymbol{\omega}^c_{nc} + \delta\boldsymbol{\omega}^c_{nc} \tag{49} $$

$$ \tilde{\mathbf{v}}^c = \mathbf{v}^c + \delta\mathbf{v}^c \tag{50} $$


where $\boldsymbol{\omega}^c_{nc}$ is the true angular rotation rate and $\mathbf{v}^c$ is the true
velocity. The tilde represents the corrupted measurement as
received from the inertial sensor. The inertial measurement
errors, $\delta\boldsymbol{\omega}^c_{nc}$ and $\delta\mathbf{v}^c$, can be represented as random vectors
with the following statistics over the interval $T_s$:


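$$ E\left[\delta\boldsymbol{\omega}^c_{nc}\right] = \mathbf{0}_{3\times 1} \tag{51} $$

$$ E\left[\delta\boldsymbol{\omega}^c_{nc}\, \delta\boldsymbol{\omega}^{cT}_{nc}\right] = Q_\omega \tag{52} $$

$$ E\left[\delta\mathbf{v}^c\right] = \mathbf{0}_{3\times 1} \tag{53} $$

$$ E\left[\delta\mathbf{v}^c\, \delta\mathbf{v}^{cT}\right] = \left(\ldots + q_a T_s\right) \mathbf{I}_{3\times 3} \tag{54} $$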


The gyroscopic and accelerometer error sources are assumed
to be collectively independent. Substituting the velocity
and angle increment measurements from the inertial
sensor algorithm into the pixel motion equations from
Eqns. (30) and (31) and integrating the error terms results
in the residual pixel motion error rate due to inertial
measurement errors:


where $\delta u$ and $\delta v$ are the random pixel location error rates
in the x and y directions, respectively. The standard deviation
of the residual pixel errors is given by calculating the
variance of pixel errors after integrating over an interval of
$T_s$, yielding:

The temporal sampling constraint can be applied in a similar
manner as before; however, in this case the constraint is
applied to the standard deviation of the residual error of pixel
motion rather than the total pixel motion considered in the
unaided case.

$$ \sigma_{K_{\max}} = \max\{\sigma_u,\ \sigma_v\} \tag{59} $$

Enforcing the temporal sampling constraint on the residual
random pixel motion requires selecting a confidence interval
such that the residual pixel motion is constrained to less
than one pixel uncertainty. This can be accomplished by
evaluating the resulting probability distribution function of
the residual pixel errors.

The preceding development is illustrated using a simple
example. In this example, a consumer-grade inertial sensor is
available with the following random walk parameters:

$$ q_\omega = 4.2 \times 10^{-7}\ \frac{\text{rad}^2}{\text{s}} \tag{60} $$

$$ q_a = 1.9 \times 10^{-5} \tag{61} $$

As a further simplification, the pan components are isolated
by assuming relatively distant targets (e.g., $s^c_z \to \infty$). This
results in the following pixel uncertainties

$$ \sigma_u = \frac{f_0}{\Delta x}\left[T_s q_\omega\right]^{1/2} \tag{62} $$

$$ \sigma_v = \frac{f_0}{\Delta y}\left[T_s q_\omega\right]^{1/2} \tag{63} $$

Applying a 3-$\sigma$ bound to the prediction errors results in
the following temporal sampling constraint

$$ 3\sigma_{K_{\max}} = 3 \max\left\{ \frac{f_0}{\Delta x}\left[T_s q_\omega\right]^{1/2},\ \frac{f_0}{\Delta y}\left[T_s q_\omega\right]^{1/2} \right\} < 1 \tag{64} $$

$$ 3\sigma_{K_{\max}} = 3 f_0 \left[T_s q_\omega\right]^{1/2} \max\left\{ \frac{1}{\Delta x},\ \frac{1}{\Delta y} \right\} < 1 \tag{65} $$

Solving for the sampling interval yields

$$ T_s < \frac{1}{9 f_0^2 q_\omega} \min\left\{\Delta x^2,\ \Delta y^2\right\} \tag{66} $$

Substituting the previously presented camera and inertial
parameters yields:

$$ T_s < \frac{1}{7.03} \quad (\text{sec}) \tag{67} $$

which  results  in  a  minimum  frame  rate  of  7.03  Hz  and  max¬ 
imum  exposure  time  of  142.3  ms. 
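The aided result can be checked numerically. The sketch below evaluates the bound $T_s < \min\{\Delta x^2, \Delta y^2\} / (9 f_0^2 q_\omega)$, which is the form obtained by squaring the 3-sigma constraint of Eq. (64), using the Table 1 camera values and the gyro random walk parameter of Eq. (60). The helper names and the `k_sigma` parameter are hypothetical, and the coverage figure assumes the residual pixel error is Gaussian, which the paper does not state explicitly.

```python
import math


def aided_sampling_interval(f0, dx, dy, q_w, k_sigma=3.0):
    """Upper bound on the sampling interval T_s (seconds) such that k_sigma
    standard deviations of residual pixel motion stay below one pixel."""
    return min(dx, dy) ** 2 / (k_sigma ** 2 * f0 ** 2 * q_w)


def gaussian_coverage(k_sigma):
    """Probability that a zero-mean Gaussian error lies within +/- k_sigma
    standard deviations (assumes Gaussian residuals)."""
    return math.erf(k_sigma / math.sqrt(2.0))


ts_max = aided_sampling_interval(f0=6e-3, dx=4.4e-6, dy=4.4e-6, q_w=4.2e-7)

print(f"maximum sampling interval: {ts_max * 1e3:.1f} ms")
print(f"minimum frame rate: {1.0 / ts_max:.2f} Hz")
print(f"3-sigma coverage: {gaussian_coverage(3.0):.4f}")
```

Since the bound scales as $1/k^2$, tightening the confidence interval from 3 to 4 standard deviations would shorten the allowable interval by a factor of 16/9, so the choice of confidence level directly trades frame rate against the probability of an aliased prediction.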

This  illustration  shows  the  benefits  possible  when  uti¬ 
lizing  inertial  measurements  to  reduce  temporal  aliasing. 
This  result  implies  that  as  long  as  the  rotational  motion 
is  within  the  bandwidth  of  the  inertial  sensor,  sampling 
at  >  7.03  Hertz  will  give  acceptable  anti-aliased  results. 
Thus,  when  incorporating  inertial  measurements  for  fea¬ 
ture  anti-aliasing,  the  sampling  rate  is  independent  of  cam¬ 
era  motion. 

CONCLUSIONS  AND  FUTURE  WORK 

In  this  paper,  the  concepts  relating  spatial  and  temporal  im¬ 
age  sampling  are  explored  from  first  principles  with  focus 
on  the  consequences  for  the  correspondence  search  and  fea¬ 
ture  tracking  problem.  The  sampling  theory  is  developed 
and  shown  to  yield  a  natural  interrelationship  between  ac¬ 
ceptable  spatial  and  temporal  sampling  frequencies.  The 
relationships  between  apparent  feature  motion  and  tempo¬ 
ral  sampling  requirements  are  shown  to  require  very  high 
(possibly  unattainable)  temporal  sampling  rates  in  order  to 
guarantee  un-aliased  sampling.  We  believe  this  is  an  un¬ 
derlying  cause  which  forces  designers  to  exploit  compli¬ 
cated  feature  tracking  algorithms,  which,  in  essence,  can 
be viewed as sophisticated anti-aliasing filters. A case in
point is the operation of the optical mouse. It uses a
correlation algorithm and assumes a linear (x-y) planar motion.
This, in turn, simplifies the computations and allows for
the use of a very high sampling rate: evidently, the
aliasing/ambiguity issue is well appreciated because the
sampling rate used is 1.8 kHz [2].

Once  the  problem  is  posed  from  this  perspective,  the  in¬ 
corporation  of  inertial  sensors  is  a  natural  choice.  Iner¬ 
tial  sensors  are  shown  to  have  the  capability  to  statisti¬ 
cally  constrain  the  apparent  motion  effects,  which  can  re¬ 
sult  in  a  significant  reduction  in  required  temporal  sam¬ 
pling  rates  while  alleviating  the  burden  of  feature  corre¬ 
spondence  search.  In  essence,  inertial  sensors  are  proposed 
to  provide  us  with  a  direct  method  for  reducing  or  eliminat¬ 
ing  temporal  aliasing,  allowing  for  the  use  of  sophisticated 
and  efficient/robust  correspondence  search  algorithms  and 
operation  under  lower-lighting  conditions. 

Indeed,  the  use  of  inertial  measurements  for  aiding  the  fea¬ 
ture  correspondence  search  task  is  akin  to  the  use  of  inertial 
measurements  in  ultra-tightly  coupled  GPS  and  INS  where 
the  inertial  information  is  used  to  steer  the  phase-locked 
loops  in  a  feed-forward  mechanization.  This  facilitates  pre¬ 
cise  code-tracking  under  dynamic  conditions  -  a  powerful 
combination  of  precision  and  robustness  which  is  the  hall¬ 
mark  of  properly  fused  synergistic  sensors  [15]. 

There  are  a  number  of  issues  which  require  further  work 
and  development.  First,  applying  statistical  constraints 
from  inertial  sensors  requires  some  knowledge  of  the  scene 
to  properly  account  for  translational  motion.  We  propose 
to  address  this  issue  by  incorporating  statistical  knowledge 
of  the  terrain  (either  a  priori  or  in  situ)  which  could  be  ap¬ 
plied  dynamically  to  either  control  temporal  sampling  rate 
or  to  exclude  features  for  which  aliasing  is  predicted. 

Secondly,  this  development  does  not  exploit  any  geomet¬ 
ric  constraints  regarding  the  scene  itself.  In  certain  cases, 
(e.g.,  an  aircraft  imaging  a  relatively  flat  scene)  the  tem¬ 
poral  sampling  rate  can  be  reduced  below  the  worst-case 
threshold  presented  in  this  paper. 

Ultimately, we believe this theory demonstrates the
complementary nature of imaging and inertial sensors. As
such, properly incorporating inertial sensors can be a major
advantage in developing robust image tracking applications
within reasonable imaging and image processing
constraints.

DISCLAIMER 

The views expressed in this article are those of the author
and do not reflect the official policy or position of the
United States Air Force, Department of Defense, or the U.S.
Government.